A Robust, Diverse and Challenging Benchmark for Measuring Cultural Knowledge of LLMs
Kelly Chiu (kellycyy)
Recent Activity
Updated a dataset 11 days ago: kellycyy/AIRiskDilemmas
Commented on a paper 11 days ago: Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas
Collections: 1
Papers: 1
Models: 0 (none public yet)
Datasets: 5
kellycyy/AIRiskDilemmas • 42.6k • 175
kellycyy/daily_dilemmas • 17.7k • 118 • 3
kellycyy/CulturalBench • 6.14k • 677 • 4
kellycyy/wildentities_classify • 8.61k • 9
kellycyy/wildchat-factual-classify • 8.53k • 10